Introduction 

A single study can rarely provide a generalisable and definitive answer to a research 
question focussed within the social sciences (Cooper, 1989; Hunter, Schmidt & Jackson, 1982; 
McGaw, 1997). Results of a single study are frequently influenced by sampling characteristics such 
as the sample population, study setting, and timing. The research environment is often difficult to 
control and human behaviour complex to explain (Wolf, 1986). In many areas, particularly 
Education, economic constraints may restrict the scale of any single study (Draper et al., 1992). As 
a consequence, the comprehensive investigation of an area, such as internet-based courses, may 
require the combination of results from several individual studies. 

More causal factors of a particular effect are likely to be detected by a research synthesis 
than by a single study (Cook et al., 1992). Often different individual studies provide conflicting 
results which can have confusing implications (Wolf, 1986). Knowledge in the social sciences, 
therefore, should progress by recognising the generalisable trends and underlying principles across a 
large body of empirical studies (Niemi, 1986). Synthesis of primary research is also important to 
transmit the accumulated knowledge to lay persons and to determine the direction of subsequent 
research, policies and practice (Cooper & Rosenthal, 1980; Sandelowski, Docherty & Emden, 
1997). 

Research review plays an important role in dissemination of knowledge and in shaping 
further research and practice. Therefore the methodology of research synthesis is crucial (Glass, 
McGaw & Smith, 1981; Dunkin, 1996). Contemporary methods of research synthesis include 
traditional narrative reviews, meta-analyses, best-evidence syntheses and methods of synthesising 
qualitative research. When rigidly followed, none of these contemporary methods can 
comprehensively review research in any specific area of interest. This paper will highlight the 
relative strengths and weaknesses of the contemporary methods of research synthesis and propose a 
multi-stage approach to research synthesis that draws on the strengths of each of these individual 
methods. In this approach, the decisions at every step of the synthesis process will be guided by the 
nature of the data. 



A Critique of Contemporary Methods of Research Synthesis 

Traditional Narrative Reviews of Research 
Traditional literature reviews are often narrative reports of an intuitive aggregate of 
individual research findings (Johnson, 1989). Good traditional narrative reviews can synthesise 
individual research studies, both quantitative studies and qualitative studies, into a conceptually 
meaningful product. These reviews are flexible in their methodology and can be undertaken 
effectively by an experienced research reviewer. But this flexibility can be associated with a high 
level of subjectivity that may explain inconsistencies in the conclusions of different reviews on the 
same issue. The criteria for the inclusion of particular studies in a narrative review have not always 
been made sufficiently clear, which makes it difficult for the reader to fully appreciate the effect of 
the reviewer's theoretical position on the review's findings. Different primary research studies may 
use different methodologies and precision levels, which in turn are handled differently by different 
reviewers (McGaw, 1997). 



Traditional narrative reviews are often inconclusive, especially when the review includes 
several individual findings supporting conflicting hypotheses or contradictory narratives. 
Therefore, when compared to statistical procedures, traditional narrative reviews are more inclined 
to commit Type II errors (Cooper & Rosenthal, 1980). These reviews usually ignore unpublished 
research, which in turn introduces a publication bias (Glass, McGaw & Smith, 1981). Thus, at 
times different traditional reviews may even consistently misrepresent the literature, either by failing to 
detect a significant effect or by being biased in favour of published research (Wolf, 1986). 
Sometimes these reviews use a "voting method" to determine if an effect exists. In a voting 
method, all the findings are divided into three categories: those with statistically significant results 
in one direction; those with statistically significant results in the opposite direction; and those with 
statistically insignificant results. This method tends to give equal weight to studies with different 
sample sizes and effect sizes at varying significance levels, resulting in misleading conclusions. No 
matter what conclusion is reached, a major problem remains to determine the size of the effect 
(Abrami, Chambers, Poulsen, De Simone, d'Apollonia & Howden, 1995; Hunter, Schmidt & 
Jackson, 1982). Further, these methods often fail to identify the variables, or study characteristics, 
that could moderate the effect (McGaw, 1997). 
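
The voting method described above can be sketched in a few lines of Python. This is only an illustration: the encoding of each finding as a (direction, p-value) pair, the 0.05 threshold and the example data are assumptions made for the sketch, not part of any cited procedure.

```python
from collections import Counter

def vote_count(findings, alpha=0.05):
    """Tally findings into the three voting-method categories.

    findings: iterable of (direction, p_value) pairs, where direction is
    +1 or -1 (a hypothetical encoding chosen for this sketch).
    """
    tally = Counter()
    for direction, p_value in findings:
        if p_value >= alpha:
            tally["non-significant"] += 1
        elif direction > 0:
            tally["significant positive"] += 1
        else:
            tally["significant negative"] += 1
    return tally

# Hypothetical findings: sample size and effect size play no part in the vote.
findings = [(+1, 0.20), (+1, 0.30), (+1, 0.01), (-1, 0.60)]
votes = vote_count(findings)
```

Here three of the four hypothetical findings fall into the non-significant category even though three of the four point in the same direction, and the tally says nothing about the size of the effect.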

Meta-analysis 

Glass (1976) argued that variability and uncertainty of data in research synthesis are as 
evident as in the data analysis of primary research. Hence research synthesis requires the same 
statistical rigour as is demanded in the data analysis of an empirical study. With these views in 
mind, he proposed a statistical method of research integration that he called "meta-analysis". 
Meta-analysis is the quantitative integration and analysis of the findings from all the empirical studies 
relevant to an issue and amenable to quantitative aggregation. It not only quantifies the effect of a 
treatment, but also identifies potential moderator variables of the effect. In a meta-analysis, 
findings from different studies are expressed in terms of a common metric called the effect size. In 
general, the effect size is the difference between the means of the experimental and control 
conditions divided by the standard deviation (Glass, 1976; Wolf, 1986). 

Meta-analysis has several advantages over traditional narrative review. It not only shows 
the direction of the effect of a treatment, but also quantifies the effect and identifies the moderator 
variables. It includes all the quantitative empirical studies relevant to the research question and 
should be free from the subjectivity introduced by selective sampling. The criteria used for 
selecting the findings included in the synthesis are explicitly stated to remove any unstated 
ambiguity (Hunter, Schmidt & Jackson, 1982). Meta-analysis can provide a general conclusive 
answer to a question (Glass, McGaw & Smith, 1981). It is sufficiently robust to deal with a large 
number of empirical studies (McGaw, 1997). 

However, meta-analyses are not free from criticisms. They can overgeneralise, include 
results from poorly designed studies, be biased in favour of published research in comparison to 
unpublished research, give more weight to studies with multiple results and ignore studies for 
which the effect size cannot be computed. In particular, qualitative studies are inevitably excluded 
from such research syntheses (Slavin, 1986). 

Best-evidence Synthesis 

To overcome the limitations of traditional narrative review and meta-analysis, 
Slavin (1986) proposed the method of "best-evidence synthesis", which, in theory, draws on 
the strengths of both traditional narrative review and meta-analysis. According to 
Slavin, best-evidence syntheses incorporate the statistical rigour of meta-analyses to synthesise 
quantitative findings together with the flexibility of traditional narrative reviews. The method is 
freed from unacknowledged subjectivity by including well-justified and well-described inclusion 
criteria for empirical studies (Slavin, 1986). 

The method of best-evidence synthesis does not prescribe a rigid set of criteria for selecting 
the empirical studies. Like traditional narrative reviews, best-evidence syntheses allow for the 
individual differences in priorities from review to review. Like meta-analyses, best-evidence 
syntheses explicitly state the criteria for including or excluding the individual research reports. 
Best-evidence syntheses do not exclude all the studies for which computation of the effect size is 
not possible. Unlike meta-analyses, best-evidence syntheses are not limited to statistical 
aggregation and analysis of only quantitative findings from individual studies. In this method, 
statistical analysis is supplemented with a rich literature review which explains any discrepancies 
observed and summarises the results which cannot be quantified (Slavin, 1986). 

A closer inspection of best-evidence syntheses reveals some major differences between the 
meta-analytic aspect of Slavin's method and the contemporarily acceptable meta-analytic procedures. For 
instance, unlike meta-analyses, best-evidence syntheses take the median effect size rather than the 
appropriately weighted mean effect size as the pooled effect size (Slavin, 1986; Veenman, 1995). 
While Slavin's modifications are rarely referenced in the meta-analytic literature, contemporary 
meta-analytic procedures have undergone rigorous criticisms and modifications, as evident in the 
vast literature on different aspects of meta-analysis. Slavin's method also fails to provide guidelines 
for systematic and rigorous methods of synthesising qualitative research. 

Synthesis of Qualitative Research 

Qualitative researchers argue that synthesis of qualitative research should be interpretive 
rather than aggregative. While preserving the integrity and holism of individual studies, inductive 
and interpretive techniques should be used to sufficiently summarise the findings of individual 
studies into a product of practical value (Jensen & Allen, 1996; Noblit & Hare, 1988; Sandelowski, 
1997; Sandelowski, Docherty, & Emden, 1997). 

According to Jensen and Allen (1996), an interpretive synthesis is essentially a reciprocal 
translation of key metaphors of each study in terms of the key metaphors of other studies (described 
in the next section). Hence, they argue that an interpretive synthesis should include studies that use 
similar methodologies only. However, reciprocal translational synthesis is only one of the possible 
types of interpretive synthesis. "Refutational" synthesis and "lines of argument" synthesis are two 
other forms of interpretive synthesis advocated by Noblit and Hare (1988) (described in the next section). 
The criteria for inclusion of individual studies should be based on conceptual considerations rather 
than only methodological considerations (Noblit & Hare, 1988; Sandelowski, Docherty & Emden, 
1997). 

The purpose of an interpretive synthesis of qualitative research is not to generate predictive 
theories but to facilitate a fuller understanding of the phenomenon, context or culture under 
consideration (Jensen & Allen, 1996; Sandelowski, 1997). It is our contention that policy making 
should be informed not only by quantitative research findings, but also by qualitative research findings. 

Principal Argument of This Paper 

Traditional narrative reviews, meta-analyses, best-evidence syntheses, and qualitative 
research syntheses have their own strengths and weaknesses. This paper argues that a 
comprehensive research synthesis should include quantitative as well as qualitative research 
findings. A key assertion of this paper is that the process of synthesising research should be 
inductive and interpretive rather than a rigid set of procedures and techniques. 



A Multi-Stage Approach of Synthesising Quantitative and Qualitative Findings 

Several criticisms of each of the above mentioned methods of synthesis are not specific to 
traditional narrative reviews, meta-analyses, best-evidence syntheses or interpretive syntheses per 
se, but can be generalised to every research synthesis method (McGaw, 1997; Sandelowski, 1997). 
Likewise, issues of rigour at various stages of synthesis are often similar across different methods 
of research synthesis. Instead of arbitrarily excluding any body of literature because of its 
methodological paradigm, a good research synthesis should comprehensively include qualitative as 
well as quantitative findings. The quantitative and qualitative approaches should be complementary 
rather than adversarial. 

In educational research, the researcher often does not have control over all the variables. 
Leinhardt and Leinhardt (1997), therefore, emphasise the importance of exploratory data analysis. 
They urge educational researchers to immerse themselves in their data and let the procedures for 
analysis be guided by the nature of the data, before performing inferential statistics. 

This paper argues that a similar inductive approach is required not only in the preliminary 
data analysis of primary research, but also in the synthesis of results from individual primary 
research studies. The notion of inductive analysis is not exclusive to the synthesis of qualitative 
findings. The approach is equally applicable in the meta-analytic synthesis of quantitative findings 
where the selection of particular statistical techniques for data analysis should be determined by the 
nature of the data, rather than any rigidly prescribed rules. For example, the spread of individual 
effect sizes should be examined before deciding whether to use parametric or non-parametric tests 
for statistical analyses (McGaw, 1997). Consistent with the spirit of exploratory data analysis, the 
preliminary data analysis may also be enriched through the use of graphs and visual displays of data 
(Light & Pillemer, 1984). Figure 1 illustrates a schematic diagram of the multi-stage approach of 
research synthesis as proposed in this paper. 

Inclusion Criteria 

The criteria for inclusion of individual studies should be conceptual. Good research studies 
should not be excluded just because they do not follow a particular methodological paradigm. All 
the individual primary research studies relevant to the particular context, concept, culture, or 
strategy under examination should be included in the synthesis. 

Open Coding and Categorisation of Studies into Sets and Subsets 

Each selected report should first be coded using an open coding scheme for substantive 
variables, the nature of reported data, and the findings relevant to the purpose of research synthesis. 
These reports should then be categorised into sets with similar research focuses. Studies within 
each set should further be categorised into subsets using similar methodological paradigms. 
Findings within each subset should be synthesised using meta-analytic, aggregative, or reciprocal 
translational methods of synthesis. Synthesis results of each subset should be synthesised across 
the subsets within each set. Synthesis results of each set should then be synthesised across the sets 
using an inductive and interpretive approach. At every level the relationship between the studies 
within a group should decide the nature of the synthesis process and the synthesis product at that 
level. 
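
The categorisation into sets and subsets can be pictured as a two-level grouping. The sketch below is a minimal illustration that assumes each coded report carries a research focus and a methodological paradigm; the field names `id`, `focus` and `paradigm` and the example records are hypothetical and are not taken from any coding scheme described in this paper.

```python
from collections import defaultdict

def group_studies(studies):
    """Group coded reports into sets (by focus) and subsets (by paradigm)."""
    groups = defaultdict(lambda: defaultdict(list))
    for study in studies:
        groups[study["focus"]][study["paradigm"]].append(study["id"])
    # Convert to plain dictionaries for easier inspection.
    return {focus: dict(subsets) for focus, subsets in groups.items()}

# Hypothetical coded reports.
studies = [
    {"id": "S1", "focus": "control preference", "paradigm": "experimental"},
    {"id": "S2", "focus": "control preference", "paradigm": "participant observation"},
    {"id": "S3", "focus": "control preference", "paradigm": "experimental"},
    {"id": "S4", "focus": "learner demographics", "paradigm": "survey"},
]
sets = group_studies(studies)
```

Synthesis would then proceed from the innermost lists (subsets) outwards to the sets, with the method at each level chosen by the researcher according to the relationship between the studies in that group.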



Dialectical and Hermeneutic Approach 

At each stage, the synthesis process should be dialectical and hermeneutic (Jensen & Allen, 
1996) as illustrated in Figure 2. Let Set 1 be a collection of studies that investigates the relative 
preferences of students enrolled in online courses for teacher control versus student control. Within 
this set, let there be three subsets. Subset 1.1 is a collection of quantitative findings that can be 
synthesised using meta-analytic procedures. Subset 1.2 includes qualitative findings from 
participant observations and Subset 1.3 includes findings from open-ended surveys conducted on 
students. In this case, synthesis of findings within individual subsets could be influenced by the 
synthesis results of other subsets. For instance, synthesis results from participant observations of 
Subset 1.2 could explain the contradictions in the quantitative findings from individual studies in 
Subset 1.1. Synthesis results of survey findings from Subset 1.3 may facilitate the synthesis process 
of findings from participant observations of Subset 1.2 by providing possible explanations for 
differences in key findings from different narratives. The researcher's synthetic interpretation of all the 
reports in Set 1 is the synthesis product of the synthesis products of individual subsets within 
Set 1. This synthetic interpretation may further modify the synthesis products of individual subsets, 
which in turn may influence the synthesis product of the set. Thus, the synthesis process at this 
stage is dialectical rather than sequential. 

Figure 1: Schematic diagram of the multi-stage approach proposed in this paper 

Figure 2: Dialectical and Hermeneutic Synthetic Process 

This dialectical approach should be followed even when synthesising the findings across 
individual sets. For example, let Set 2 be a collection of studies that examine common features of 
the students enrolled in online courses. The synthesis product of Set 1 may indicate that online students 
prefer more student control, rather than teacher control, in their online learning experiences. A 
closer examination of the demographic features of the online learners from the findings of Set 2 
may further suggest that a majority of online learners are mature-age individuals who are working 
full-time and have to constantly meet the demands made by their professional lives. This could 
provide a possible explanation for the synthesis results of Set 1. The researcher's synthetic 
interpretation of all the reports is the synthesis product of the synthesis products of findings within 
individual sets. The synthesis process at this stage should also be dialectical and hermeneutic. 

Qualitative Synthesis 

The challenge in synthesising qualitative research lies in summarising the reports in a usable 
format, while preserving the integrity and holism of individual reports (Jensen & Allen, 1996; 
Noblit & Hare, 1988; Sandelowski, 1997; Sandelowski, Docherty, & Emden, 1997). The synthesis 
of qualitative findings should be an inductive, hermeneutic and eclectic process at every stage where 
the nature of the synthesis product should be guided by the question under consideration and the 
relationships between the findings and methodological positioning of individual reports. The 
following sub-sections illustrate some of the techniques that could be used in the synthesis process. 

Content Analysis of Themes 

Content analysis of themes could be used to identify the major questions that have been 
addressed in the research literature available on a particular field of interest. Content analysis is a 
technique for systematic and quantitative analysis of the manifest content of communication to 
make valid and replicable inferences. In this method, first of all a decision is made about the unit of 
analysis. This unit can be a string of text or a theme (Anderson, 1997; Tesch, 1990). As the 
purpose of a research synthesis is unlikely to be a semiotic investigation, the unit of analysis could 
be key themes or findings emerging from the data. Various categories, major themes in case of a 
research synthesis, should be identified as they emerge from the data. These categories should then 
be operationally defined such that they are mutually exclusive. Using frequency counts of the 
occurrences of individual categories, the research synthesist can identify the areas of research that 
have been thoroughly examined and those areas that need further examination. 
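
The frequency-count step can be sketched as follows. The sketch assumes that each coded unit has already been assigned to exactly one category, in keeping with the requirement that categories be mutually exclusive; the report labels and theme names are hypothetical.

```python
from collections import Counter

# Hypothetical coded units: (report, category). Each unit belongs to
# exactly one category because the categories are mutually exclusive.
coded_units = [
    ("report_A", "learner control"),
    ("report_A", "interaction"),
    ("report_B", "learner control"),
    ("report_C", "learner control"),
    ("report_C", "assessment"),
]

def theme_frequencies(units):
    """Count how often each category occurs across all coded units."""
    return Counter(category for _report, category in units)

frequencies = theme_frequencies(coded_units)
most_examined = frequencies.most_common(1)[0][0]
```

Categories with high counts point to areas that have been examined often, and those with low counts to areas that may need further examination.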

Phenomenography 

Phenomenography is a systematic method of examining the various ways in which people 
perceive, understand, experience, or conceptualise a particular phenomenon (Marton & Chaiklin, 
1994; Marton, 1997). The main assumptions underlying a phenomenographic study are that: 



* There are a limited number of qualitatively different ways in which any phenomenon is 
conceptualised by different individuals. The conceptions of different individuals about the same 
phenomenon can therefore be classified into a finite number of "categories of description". 

* The categories thus formed can frequently be logically related in a hierarchical structure referred 
to as "outcome space". 

Although most phenomenographic studies use interview data, sometimes other forms of 
data, such as documents, observations and written responses, can also lend themselves to a 
phenomenographic study. In a research synthesis, sections of research papers that address the 
conceptions of a particular phenomenon among the informants could be treated as responses to 
open-ended questions and could be subjected to a phenomenographic analysis. 

The analysis of data in a phenomenographic study begins with decontextualisation. The 
boundaries of individual responses are removed and the entire data set is classified into categories 
by looking for similarities and contrasts. This data is then put back into context to identify the 
relation between various responses within the same category. These categories are further 
examined to find the logical relationships between individual categories and study continuous 
variations across categories. 

Reciprocal Translational Synthesis 

To synthesise reports within a subset of similar findings and methodological paradigms, 
"reciprocal translational synthesis" (Noblit & Hare, 1988) could be used. This method assumes that 
the individual reports are addressing similar issues and can be integrated. To begin with, the key 
metaphors, themes, perspectives, or concepts emerging from individual reports that can capture the 
essence of that report in a reduced form are identified. The findings of each report are then tested 
for their abilities to translate the findings of other reports. Thus we select those terms or findings 
that can more succinctly describe the findings of all the reports within the subset. At times, the 
terms employed in individual reports may not be suitable to portray concisely all the reports. In 
those cases, new terms could be introduced that adequately describe the major findings from all the 
reports. 

Refutational Synthesis 

When individual reports give conflicting representations of the same phenomenon, they are 
not amenable to a reciprocal translational synthesis. These reports lend themselves to a 
"refutational synthesis" (Noblit & Hare, 1988) where the relationship between individual studies 
and the refutations become the focus of synthesis process. This process begins with the 
identification of key findings of individual reports followed by an examination of the relationships 
between individual reports. The contradictions between individual reports may be explicit or 
implicit. The implicit refutations are made explicit using an interpretive approach. New metaphors 
are created to explain the key refutations. These metaphors are then used to explain the 
contradictions between other reports. 

Lines-of-argument Synthesis 

At some level, if the individual reports examine different aspects of the same phenomenon, 
"lines-of-argument" synthesis method (Noblit & Hare, 1988) could be used. The main purpose of a 
"lines-of-argument" synthesis is to make inferences. In this method, findings from individual 
reports are used as pixels to get a fuller picture of the phenomenon at hand. The method involves a 
grounded theory approach for open-coding and identifying the categories emerging from the data. 
The key categories that are more powerful in representing the entire data-set are identified by 
constant comparisons between individual accounts. These categories are then linked interpretively 
to create a holistic account of the whole phenomenon. 

Quantitative Synthesis 

The quantitative findings could be synthesised using meta-analytic procedures. The 
methods of meta-analytic synthesis have been well established. The Handbook of Research 
Synthesis (Cooper & Hedges, 1994) is a particularly good collection of contributions from experts 
in the field of meta-analysis that comprehensively deals with different aspects of meta-analysis. 
The main stages of the process of meta-analysis are as follows. 

Identifying and Coding the Variables 

A coding sheet is developed to record the demographic and substantive variables and key 
findings of individual reports. Each report is then closely examined to fill in the relevant 
information in the coding form. The next subsection describes the procedure for calculating an 
Effect Size to quantify individual findings on a common metric. 

Calculating the Effect Sizes for Individual Studies 

Effect size is the measure of the magnitude of the effect of an independent variable on the 
dependent variable or a measure of the relationship between two variables (Rosenthal, 1994). In 
Education, we often use the standardised difference between the means of the experimental group 
and the control group as a measure of the effect size. 

Algebraically, 

$$ g = \frac{M_e - M_c}{SD} $$

where $M_e$ is the mean of the experimental group, $M_c$ is the mean of the control group and 
$SD$ is the standard deviation (Glass, McGaw & Smith, 1981). The effect sizes calculated thus, 
referred to as g-statistics, often give a biased estimate of the population effect size, especially for 
studies with small samples (Hedges & Olkin, 1985). To remove this bias, each g-statistic should be 
converted to the metric termed the d-statistic. The d-statistic can be computed from the g-statistic using 
the following formula: 

$$ d = J(N-2)\,g, \quad \text{where } J(m) = 1 - \frac{3}{4m - 1} \text{ and } m \text{ is an integer.} $$

Rosenthal (1994) and Fleiss (1994) provide a detailed discussion of the applications of 
different formulae for effect sizes and the formulae for transformations between different 
representations of the effect size. 
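
The two formulae in this subsection can be sketched in code as follows. The sketch assumes that N is the total sample size of the two groups, which the text above does not state explicitly, and it leaves the choice of standard deviation to the caller; the function names and the example figures are hypothetical.

```python
def g_statistic(mean_e, mean_c, sd):
    """Standardised mean difference: g = (M_e - M_c) / SD."""
    return (mean_e - mean_c) / sd

def correction_j(m):
    """Small-sample correction factor: J(m) = 1 - 3 / (4m - 1)."""
    return 1 - 3 / (4 * m - 1)

def d_statistic(g, n_total):
    """Bias-corrected effect size: d = J(N - 2) * g.

    n_total is assumed to be the combined size of both groups.
    """
    return correction_j(n_total - 2) * g

# Hypothetical study: group means 75 and 70, SD 10, 20 participants in total.
g = g_statistic(75.0, 70.0, 10.0)
d = d_statistic(g, 20)
```

Because J(m) is below 1 for any positive m and approaches 1 as m grows, the correction shrinks g slightly, and the shrinkage matters most for small studies.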

Computation of a Composite Effect Size 

To provide an estimate of the central tendency of the individual effect sizes, a composite 
effect size is computed. This could be the median of individual effect sizes, the median of 
appropriately weighted effect sizes, the mean of individual effect sizes or the mean of appropriately 
weighted effect sizes. The decision to use a particular measure of central tendency should take into 
consideration several factors such as the nature of the spread of the individual effect sizes or the 
purpose of research synthesis. 
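
The following sketch compares three of these candidate measures on hypothetical data. The text does not specify what "appropriately weighted" means; the sketch assumes inverse-variance weights with the large-sample variance of d given by Hedges and Olkin (1985), which is one common choice rather than the only one.

```python
from statistics import mean, median

def d_variance(d, n_e, n_c):
    """Approximate sampling variance of d (assumed large-sample formula)."""
    return (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))

def weighted_mean(effects, weights):
    """Mean of effect sizes weighted by the supplied weights."""
    return sum(w * d for d, w in zip(effects, weights)) / sum(weights)

# Hypothetical studies: (effect size d, experimental n, control n).
studies = [(0.2, 50, 50), (0.5, 20, 20), (1.4, 5, 5)]
effects = [d for d, _, _ in studies]
weights = [1 / d_variance(d, n_e, n_c) for d, n_e, n_c in studies]

median_effect = median(effects)
mean_effect = mean(effects)
weighted_effect = weighted_mean(effects, weights)
```

In this hypothetical example the largest effect comes from the smallest study, so the weighted mean falls below the unweighted mean, which illustrates why the spread of the effect sizes and the sample sizes should inform the choice of measure.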

Homogeneity Analysis 

The pooled effect size is not treated as a conclusive result on the subject. An attempt is 
made to explain any marked differences between the pooled effect size and those of the individual 
studies (Slavin, 1986; Hedges & Olkin, 1985). Study outcomes within each category are analysed 
for homogeneity to determine if a single effect size is a good representation of the individual 
studies (Mullen & Rosenthal, 1985; Hedges & Olkin, 1985; Johnson, 1989). 



Within each category the homogeneity statistic between the studies ($Q_B$) is estimated. $Q_B$ is 
assumed to have an approximate chi-square distribution with m-1 degrees of freedom, where m is 
the number of studies within each category. A non-significant value of $Q_B$ indicates that the 
outcomes are consistent across the studies. In these cases the composite effect size is taken as a 
conclusive result (Mullen & Rosenthal, 1985; Hedges & Olkin, 1985; Johnson, 1989). 
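
A minimal sketch of the homogeneity test follows. The formula for the statistic is not given above; the sketch assumes the usual weighted sum of squared deviations from the inverse-variance weighted mean, as in Hedges and Olkin (1985). The effect sizes and variances are hypothetical, and the tabled chi-square critical value for 2 degrees of freedom at the 0.05 level (5.991) is hard-coded to keep the example free of external libraries.

```python
def homogeneity_q(effects, variances):
    """Return (Q, degrees of freedom, weighted mean effect size).

    Q is the sum of w_i * (d_i - pooled)^2 with w_i = 1 / variance_i
    (an assumed form of the homogeneity statistic).
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    return q, len(effects) - 1, pooled

# Hypothetical effect sizes and their sampling variances.
effects = [0.2, 0.5, 1.4]
variances = [0.0402, 0.103125, 0.498]
q, df, pooled = homogeneity_q(effects, variances)

CRITICAL_CHI2_DF2 = 5.991  # chi-square critical value, df = 2, alpha = 0.05
is_heterogeneous = q > CRITICAL_CHI2_DF2
```

For other numbers of studies the critical value would be looked up in a chi-square table for m-1 degrees of freedom, or obtained from a statistics library.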

However, often the $Q_B$ value is significant, which indicates a considerable inconsistency in 
the study findings. In these cases, the composite effect size does not adequately describe the studies 
since the magnitudes and perhaps the directions of their findings are very different from each other. 
Further analysis is carried out for these studies to account for the differences in outcomes. First, an 
outlier diagnosis is used to isolate the studies with significantly different outcomes from the 
composite effect size (Hedges & Olkin, 1985; Johnson, 1989). Following this isolation, the 
remaining studies are subjected to categorical model testing to identify the potential moderator 
variables of the effect. The next section describes outlier diagnosis and categorical model testing in 
detail. 

Outlier Diagnosis 

At this stage of analysis, the study that reduces the homogeneity statistic, $Q_B$, by the largest 
amount is identified. If the methodology of the outlier markedly differs from the remaining studies, 
then the outlier is isolated and the difference noted. Once again the remaining studies are 
subjected to an outlier diagnosis. This isolation procedure is carried on until major differences are 
observed in the methodology and aims of the isolated studies from the remaining studies (Johnson, 
1989). This preliminary outlier diagnosis is frequently followed by categorical model testing to 
explain the remaining heterogeneity between the findings of individual studies. 
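
The identification step can be sketched as a leave-one-out search. The sketch assumes the same inverse-variance form of the homogeneity statistic as in Hedges and Olkin (1985) and uses hypothetical data; deciding whether the flagged study really differs in methodology or aims remains a judgement for the reviewer and is not something the code can settle.

```python
def q_statistic(effects, variances):
    """Weighted sum of squared deviations from the weighted mean (assumed form)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    return sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))

def largest_q_reduction(effects, variances):
    """Return (index, reduction) for the study whose removal lowers Q most."""
    full_q = q_statistic(effects, variances)
    best_index, best_q = None, full_q
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        q_without = q_statistic(rest_effects, rest_variances)
        if q_without < best_q:
            best_index, best_q = i, q_without
    return best_index, full_q - best_q

# Hypothetical effect sizes and variances; the last study stands apart.
effects = [0.30, 0.35, 0.28, 1.50]
variances = [0.02, 0.03, 0.02, 0.05]
outlier_index, reduction = largest_q_reduction(effects, variances)
```

Repeating the search on the remaining studies gives the iterative procedure described above.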

Categorical Model Testing 

Categorical model testing, which is analogous to analysis of variance (ANOVA), is used to 
account for the heterogeneity of outcomes of different studies. To begin with, the studies are 
divided into subgroups based on a study characteristic. Within each subgroup, the composite effect size and 
the within-group homogeneity statistic, $Q_W$, are estimated. $Q_W$ is assumed to have an approximate 
chi-square distribution with k-1 degrees of freedom, where k is the number of studies within each 
subgroup. A non-significant $Q_W$ value indicates consistency of outcomes within a subgroup. Along 
with the within-group homogeneity statistic ($Q_W$), the between-group homogeneity statistic ($Q_B$) is 
also estimated. A significant $Q_B$ indicates that the study characteristic under consideration 
significantly moderates the effect size (Hedges & Olkin, 1985; Johnson, 1989). 
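
The ANOVA-like partition behind categorical model testing can be sketched as follows. The sketch assumes inverse-variance weights, under which the total homogeneity statistic equals the sum of the within-group statistics plus the between-group statistic; the moderator ("level of study") and all figures are hypothetical.

```python
def pooled_and_q(pairs):
    """Return (weighted mean, Q, total weight) for a list of (d, variance)."""
    weights = [1.0 / v for _d, v in pairs]
    total_weight = sum(weights)
    pooled = sum(w * d for w, (d, _v) in zip(weights, pairs)) / total_weight
    q = sum(w * (d - pooled) ** 2 for w, (d, _v) in zip(weights, pairs))
    return pooled, q, total_weight

def categorical_model(groups):
    """Partition heterogeneity into within-group and between-group parts.

    groups: dict mapping a subgroup label to a list of (d, variance) pairs.
    """
    all_pairs = [pair for pairs in groups.values() for pair in pairs]
    grand_mean, q_total, _ = pooled_and_q(all_pairs)
    q_within = {}
    q_between = 0.0
    for label, pairs in groups.items():
        group_mean, q_group, group_weight = pooled_and_q(pairs)
        q_within[label] = q_group
        q_between += group_weight * (group_mean - grand_mean) ** 2
    return q_total, q_within, q_between

# Hypothetical subgroups defined by one study characteristic.
groups = {
    "undergraduate": [(0.2, 0.04), (0.3, 0.05)],
    "postgraduate": [(0.8, 0.04), (0.9, 0.06)],
}
q_total, q_within, q_between = categorical_model(groups)
```

In this hypothetical example nearly all of the heterogeneity lies between the subgroups rather than within them, which is the pattern expected when the chosen characteristic moderates the effect size.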

Dynamic Interplay between Synthesis of Quantitative and Qualitative Findings 

At every stage of the synthesis, the process of synthesising qualitative findings should be 
guided by the synthesis results of quantitative findings and vice-versa. For instance, synthesis 
results of qualitative findings should be used to prepare the coding-sheets for meta-analytic 
procedures for the identification of potential moderator variables of the effect size. The results of 
qualitative synthesis should be tested quantitatively using meta-analytic procedures. Likewise, 
results of meta-analytic procedures should be used to explain the qualitative findings. This 
dynamic interplay between the synthesis of quantitative and qualitative findings should facilitate a 
better understanding of the phenomenon and also increase the level of confidence in the synthesis 
results. 

Summary 

As research reviews play an important role in the dissemination of knowledge and in 
shaping future research and practice, the methodology of research synthesis is crucial. This paper 
argues that a comprehensive research synthesis should include both quantitative and qualitative 
research findings. The process of research synthesis should be inductive, interpretive, dialectic and 
hermeneutic. 